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Abstract 

Background: Segment 6 of the ISA virus codes for hemoagglutinin-esterase (HE). This segment is highly variable, 
with more than 26 variants identified. The major variation is observed in what is called the high polymorphism 
region (HPR). The role of the different HPR zones in the viral cycle or evolution remains unknown. However viruses 
that present the HPRO are avirulent, while viruses with important deletions in this region have been responsible for 
outbreaks with high mortality rates. In this work, using bioinformatic tools, we examined the influence of different 
HPRs on the adaptation of HE genes to the host translational machinery and the relationship to observed virulence. 

Methods: Translational efficiency of HE genes and their HPR were estimated analyzing codon-pair bias (CPB), 
adaptation to host codon use (codon adaptation index - CAI) and the adaptation to available tRNAs (tAI). These 
values were correlated with reported mortality for the respective ISA virus and the AG of RNA folding. tRNA 
abundance was inferred from tRNA gene numbers identified in the Salmo salar genome using tRNAScan-SE. 
Statistical correlation between data was performed using a non-parametric test. 

Results: We found that HPRO contains zones with codon pairs of low frequency and low availability of tRNA with 
respect to salmon codon-pair usage, suggesting that HPR modifies HE translational efficiency. Although calculating 
tAI was impossible because one third of tRNAs (-60.000) were tRNA-ala, translational efficiency measured by CPB 
shows that as HPR size increases, the CPB value of the HE gene decreases (P = 2x1 0" 7 , p = -0.675, n = 63) and that 
these values correlate positively with the mortality rates caused by the virus (p = 0.829, P = 2x1 0" 7 , n = 1 1). The 
mortality associated with different virus isolates or their corresponding HPR sizes were not related with the AG of 
HPR RNA folding, suggesting that the secondary structure of HPR RNA does not modify virulence. 

Conclusions: Our results suggest that HPR size affects the efficiency of gene translation, which modulates the 
virulence of the virus by a mechanism similar to that observed in production of live attenuated vaccines through 
deoptimization of codon-pair usage. 



Background 

The infectious salmon anemia virus (ISAV) is an ortho- 
myxovirus, composed of eight segments. [1]. Two phylo- 
genetic groups have been identified, the European and 
the North American, which probably diverged at the be- 
ginning of the last century [2,3]. Within each group the 
viruses present minimal genetic differences with the 
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exception of certain regions of high variability localized 
in genes encoding for glycoprotein F [4] and HE. The re- 
gion of hyper-variability in HE genes (HPR) is located 
between the transmembrane region and the globular do- 
main with hemagglutinin-esterase activity [5,6]. The 
function of HPR within the viral cycle is unknown. How- 
ever, the type of HPR in the virus has been associated 
with the degree of observed virulence. For example, the 
HPRO is found in avirulent ISAV strains, while HPR7b is 
associated with ISAV outbreaks in Chile with high mor- 
tality rates [7-9]. The molecular ratios that explain the 
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relationship between the HPR and observed virulence 
have not yet been identified. However, the HPR does not 
modify esterase or hemagglutininin activity [10]. 

Far from being silent, synonymous mutations affect di- 
verse cellular processes, such as translation kinetics, 
mRNA structure and protein folding [11]. These muta- 
tions are distributed differently in the genes according to 
their levels of expression. Highly expressed genes tend 
to use preferentially the most abundant synonymous co- 
dons [12,13]. The relationship between bias in synonym- 
ous codon usage and gene expression seems to be the 
consequence of the synchrony observed between the fre- 
quency of codon usage and the abundance of tRNA that 
recognizes it [12,14-19]. 

As well as the bias in codon usage, there is a bias in 
codon-pair usage, which is characteristic for each organ- 
ism and independent of the frequency of codon usage 
[20]. As with the synonymous codons, the codon pairs 
are distributed in the genes differently according to their 
level of gene expression [21]. 

The ribosome traffic jam in the mRNA has been sug- 
gested as a common mechanism that explains the reduc- 
tion in translation efficiency by introducing codons with 
low tRNA availability, or tRNAs with little compatibility 
among themselves, as well as the negative relationship 
between the formation of stable structures in the mRNA 
and the level of protein expression [22-25]. 

There are few examples among viruses of the influence 
of codon use on the viral cycle. Studies about the adap- 
tation to host codon usage indicate that viral genes that 
codify for critical proteins tend to use the synonymous 
codons most represented in the host genome [26]. In In- 
fluenza A, the structure acquired by the 5 end of the co- 
difying zone is fundamental for viral packaging [27,28]. 
In the hepatitis A virus, the presence of underrepre- 
sented codons in the host genome seems to be a strategy 
to avoid competition with cellular mRNAs and to in- 
crease adequate folding of the capsid protein [29,30]. In 
the influenza A and polio viruses, the introduction of 
synonymous codon pairs in the viral genome with a 
negative bias produces viable attenuated viral particles 
with reduced rates of protein synthesis and replication 
[31-33]. 

It has been suggested that the different HPR are prob- 
ably the result of in-frame deletions from sectors of 
HPRO [6,34]. These deletions could modify translation 
efficiency by elimination of underrepresented codons in 
a similar but opposite mechanism to what is observed in 
the generation in vitro of live attenuated vaccines by 
deoptimization of their codon-pair usage. In this work 
we analyze this hypothesis of the effect of HPR on the 
adaptation to host codon usage, tRNA availability, host 
codon-pair usage bias and the folding energy of the HE 
gene. 



Results 

The effect of the HPR on translation efficiency 

Translation efficiency is governed by secondary structure of 
the mRNA [35] and by its adaptation to the tRNA pool 
[12]. To evaluate if the HPR affects virulence through the 
possible formation of a stable secondary structure that re- 
duces the advance of ribosome, thus causing a decrease in 
the HE translation rate, we determined if there is a relation- 
ship between the HPR folding energy and observed viru- 
lence. Our results indicate the absence of a statistically 
significant relationship between folding AG and the size 
of the HPR region or associated mortality. However, 
the HPRO region showed the most favorable energy 
(AG = -24.2 Kcal/mol) (Additional file 1: Figure SI). 

The effect of the presence of the different HPRs on 
translation efficiency can also be evaluated by analyzing 
the degree of adaptation to the tRNA pool through the 
use of the tAI value, which corresponds to the geometric 
mean of the relative abundance of tRNA in the cells that 
recognize the codons present in the gene. The relative 
abundance of the tRNA can be calculated based on the 
number of copies of each tRNA present in the genome 
[12]. Using the tRNAScan-SE program, we identified 
63857 genes in the Salmo salar genome that codify for 
tRNAs. Approximately 21085 of these genes codify for 
pseudogenes and another 20546 codify for alanine tRNA. 
Of the rest of the tRNAs, their abundances range from 
2724 for methionine tRNA to 236 for tryptophan tRNA. 
As well, 353 untyped tRNA were identified and 20 genes 
for tRNA suppressors (Additional file 2: Table SI). As in 
the calculation of tAI, the relative abundance of the 
tRNAs that recognize the codons in the gene was nor- 
malized with respect to the codon with greater availabil- 
ity of tRNAs. The extreme disproportion between Ala- 
tRNA copies and the other tRNA for the other amino 
acids makes interpretation of the data problematic and 
could lead to erroneous conclusions. To avoid this prob- 
lem, we continue the study using adaptation to host 
codon use and bias in codon-pair usage. 

Adaptation to host codon usage and host codon pair bias 
of the ISAV hemagglutinin-esterase gene 

The codon adaptation index (CAI) is a good indicator of 
the adaptation to the host codon usage and a good pre- 
dictor of translation efficiency [13]. Previous studies 
have indicated that ISAV is the least adapted ortho- 
myxovirus to it hosts genome, sharing CAI values with 
less than 3% of the genes of Salmo salar (Tello, unpub- 
lished). Among these, the North American virus pre- 
sents the lowest level of adaptation. If the HPR modifies 
translation efficiency through the accumulation of high 
or low frequency codons, it can be expected that the 
CAI values of hemagglutinin genes are proportional to 
the size of the HPR and the protein. 
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Among the 63 analyzed sequences, no statistically signifi- 
cant relationships (P < 0.05) were found between the CAI 
value of the HE gene and the size of the proteins or ob- 
served virulence, but a relationship was observed between 
the CAI value of the HPR and the size of this region 
(p=0.393, P=0.002, n=63). As the translation efficiency is af- 
fected by the presence of underrepresented codon pair 
usage in the genome, we analyzed bias in the codon pair 
usage of the HE genes (Figure 1). The results show that HE 
shares CPB values with 20% of the salmon genes 
(Additional file 1: Figures S2 and S3). A statistically signifi- 
cant relationship was identified (p= -0.675, P=2xl0" 7 , 
n=63) between CPB value and HE size. Of the 63 analyzed 
sequences, 3 were genes of Canadian viruses, which 
presented low CPB values. Among the viruses classified as 
European, the gene containing HPR0 had the lowest CPB 
value (Figure 1). This correlation increases when analyzing 
the relationship between the CPB value of the HPR region 
of the European isolates and the size of HPR region 
(p=-0.757, P=2xl0" 7 , n=60). 

When viruses were compared according to their coun- 
try of isolation, there were significant differences (£-test, 
P = 0.008) in the mean CPB values of the HE gene of 
the Norwegian (CPB = 0.0394±0.007) and Chilean vi- 
ruses (CPB = 0.0446±0.006), with the Chilean isolates 
presenting the higher CPB values (Additional file 1: 
Figure S4). When the CPB values of the HE gene are 
classified according to their HPR type, we note that the 
lowest CPB values are found in the HE genes containing 
HPR21 and HPR0, while the highest values are found in 
the HE genes containing HPR1, HPR15 and HRP7 
(Additional file 1: Figure S5). 
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Figure 1 Relationship between CPB and HE size. The figure 
shows the CPB values of 63 proteins with complete sequences in 
the encoding region of the HE gene. The CPB values are shown for 
HE genes in the Norwegian virus (white squares), North American 
virus (gray circles), the Chilean virus (black triangles) and the British 
virus (inverted gray triangles). The square with a star indicates the 
HE gene of the HPRO genotype. 



To assess the impact of the HPR region on the total 
CPB value of HE, we determined the QCPB value. The 
results of this analysis show that the HPRO, HPR9, 
HPR3, HPR16, HPR2, HPR5, HPR4, UC010, UC011 and 
HPR6 regions negatively affect the CPB value of the gene 
that carries them, while HPR7, HPR1, HPR14, HPR15, 
UC04 and HPR21 increase total CPB value (Additional 
file 1: Figure S6). With the exception of HPR9 and 
HPR3, the HPRO segment has the lowest CPB value, 
while HPR15, HPR7 and HPR1 present the highest CPB 
values (Additional file 1: Figure S7). The HPRO and 
HPR9 regions had the most negative effect on total CPB 
value. Because the HPROs had one of the lowest CPB 
values and most negatively affected total CPB value, we 
analyzed this region to find zones rich in codon pairs 
that are underrepresented in Salmo salar. This analysis 
shows that there are three points in the 343-344, 351- 
352 and 354-355 residues in the HPRO region that 
present CPS values < 0.8, that is, that these codon pairs 
are underrepresented by at least two times. Most of the 
points with low CPB values are absent in the rest of the 
HPR variants (Figure 2). Comparison among the CPS 
profiles of the different genes shows that the most 
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Figure 2 Codon-pair score in the HPR region. The figure shows 
an alignment scheme among the different HPR in the analyzed HE 
(Panel A). Panel B shows the codon-pair score (CPS) of the codon 
pairs in the HPR region of the ISAV HE-HPRO gene. The green lines 
identify the codon pairs with lower CPS. 
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significant difference in the carboxyl terminal profiles 
are found in the HPR region. The viral isolates in Chile 
with the HPR7b show profiles with greater magnitude, 
in contrast to what is observed in the North American 
viruses and ISAV SK779 (HPRO). In the gene with 
HPRO, the HPR zone presents a descending profile that 
breaks an ascending ramp of CPB values present in the 
HPR variant with important deletions in the HPR zone 
(Figure 3). 

Relationship among codon-pair bias, HPR type and 
virulence 

Our analysis consistently shows that the genes 
containing the HPRO and HPR7 regions have extreme 
CPB values. The two HPRs are associated with opposite 
phenotypes: the HPRO region is found in avirulent 
strains of ISAV, while the HPR7 region has been system- 
atically found in Chilean outbreaks that have been re- 
sponsible for eliminating 90% of salmon in production 
centers. To determine if there is a relationship among 
virulence, HPR type and codon-pair bias, we correlated 
our CPB values to the mortality rates reported by 
Mjeeland [36]. 

These mortality rates constitute valuable information 
given that the virulence of 11 isolates of the ISA virus 
with different HPRs were analyzed and compared. The 
results of this correlation show that the CPB values of 
the HE gene correlate positively with mortality by direct 
injection (p = 0.597, P = 0.0467, n = 11) and with mor- 
tality rates with fish-to-fish transmission (TG2, p = 
0.633, P=0.032, n=ll). When we use only the CPB value 
of the HPR region, the correlation with mortality of the 
cohabitants (fish inoculated with the virus by injection) 
increases (p = 0.829, P = 2xl0~ 7 , n = 11) (Figure 4). 
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Figure 3 Profile of CPS throughout the HE gene. The figure 
shows the profile of accumulated CPS values for the 63 HE genes. 
We highlight the Chilean isolates with HPR 7b, the Canadian isolates 
and the HPRO region. 
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Figure 4 Relationship between virulence and codon-pair bias 
(CPB). The figure shows the relationship between mortality of ISAV1 
and ISAV1 1 reported by Mjaaland, and the CPB values reached by 
their HPR regions. The statistical validity of the correlation was 
determined by the non-parametric Spearman rank order test. 



Extrapolation of mortality based on the CPB value of 
the HPR shows that the Norwegian viruses with HPR 15 
(AF378180.2 and AF302799.1) and the Chilean viruses 
with HPR1C (GU830908.1 and FJ594285.1) have an esti- 
mated mortality > 80%, which is higher than values ob- 
served in ISAV2 with HPR01. When we analyzed the 
CPB of the complete HE gene, we observed that most of 
the Chilean viruses have values between those of ISAV1 
(HPR14) and ISAV2 (HPR01). The CPB value of the HE 
gene from Chilean isolates with HRP7b has a value of 
0.122, which by extrapolation becomes a virus with simi- 
lar characteristics to HPR1 (CPB = 0.146) in the isolates 
of ISAV2. ISAV2 has the highest mortality rate by ISAV 
injection (75%) and the second highest rate by cohabit- 
ation (28.9%), followed by the HPR14 in the ISAV1 iso- 
late (46.1%) (Additional file 3: Table S2). 

Discussion 

The results presented in this work indicate the existence 
of a relationship between bias in the use of codon pairs 
and the size of the HPR region or the HE gene. In particu- 
lar, our analysis shows that the different types of HPR tend 
to eliminate regions that present underrepresented codon 
pairs (negative CPS index). In Influenza A and polio, the 
introduction of codon pairs with low CPS values is an ef- 
fective way to reduce translation of the viral gene, redu- 
cing the production of viral particles and virulence. In 
contrast, the introduction of codon pairs with positive 
CPS increases protein expression [31,32]. The molecular 
explanation for this relationship is not clear, but it is prob- 
ably because not all tRNA are compatible among them- 
selves upon simultaneously occupying the A and P 
ribosome sites, as multivariate analysis suggests [37]. 

A universal codon distribution pattern has recently 
been described. In general the genes tend to accumulate 
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codons poorly adapted to their tRNA pool among the 
first 50 codons, while the more adapted codons are 
found among the last 150 to 300 nucleotides [22]. It has 
been suggested that the presence of this adaptation ramp 
to the tRNA pool serves to avoid the ribosome traffic 
jam in the mRNA, the translation efficiency being in- 
verse to the quantity of ribosomes in the mRNA. Thus, 
the presence of underrepresented codons could reduce 
translational efficiency and the advance and turn-over of 
ribosomes [22-24]. 

The HPR region is located at approximately 80 codons 
from the carboxyl terminal with respect to the gene with 
HPRO. According to the ramp observed in eukaryotic 
[22], this region should contain codons more adapted to 
the tRNA pool. The presence of underrepresented codon 
pairs in the HPR zone could decrease the velocity of the 
advance of ribosomes by the incompatibility among 
tRNA, being functionally equivalent to the use of codons 
with low levels of available tRNA. On the other hand, 
the elimination of these codon pairs could increase 
translation efficiency by decreasing the ribosome traffic 
jam in this region. The Chilean isolates with HPR7b 
present a ZCPS profile with high CPS values and a con- 
tinuous ramp toward the carboxyl terminal (Figure 3), 
which is interrupted in the virus with HPRO. Consistent 
with this, viruses containing HPRO are associated with 
avirulent ISAV strains. On the other hand, HPR regions 
where codon pairs with negative CPS have been elimi- 
nated are associated with outbreaks with high mortality 
rates, such as HPR1 or HPR7b found in 80% of the 
pathogenic outbreaks in Chile [38,39]. The virulence 
values extrapolated for the Chilean isolates are consist- 
ent with the effect of these viruses on the Chilean sal- 
mon industry. 

Current efforts to cultivate the virus in vitro with the 
HPRO genotype have not been fruitful, which is why it 
remains unknown if these viruses have lower HE protein 
synthesis rates than viruses with the HPR7b genotype. 
Nevertheless, the correlations between the CPB values in 
the HE gene with the size of the region and the degree 
of pathogenicity of the ISA virus are consistent with the 
effect of HPR on translation efficiency. 

Both immune escape due to antigenic variation pro- 
duced by different HPRs [40] and modification of 
hemagglutinin-esterase activity can act as selection 
forces for the different HPR described. Recent experi- 
ments have determined that HPR does not modify 
hemagglutinin-esterase activity [10], suggesting that the 
biological activity of the protein does not explain the dif- 
ference in observed virulence among ISAV isolates. 
There are variations in the protein sequence of the gly- 
coproteins of influenza A that give rise to antigenic drift 
[41]. However, in influenza A, the variations occur by 
specific mutations that produce variations in the 



epitopes, in contrast to what occurs in ISAV, where the 
greatest variation observed in the HPR occurs by partial 
deletions in the HPRO [34]. In ISAV, lesser variations 
have been observed on the surface of the HE protein 
[39]. However, it is not known if these mutations have 
an important effect on antigen-antibody recognition. Al- 
though antigenic variation in the ISAV hemagglutinin- 
esterase could act as a selective force, in our opinion this 
concept makes more sense in populations whose individ- 
uals are long-lived and acquire immunity after surviving 
a viral infection or vaccination. Unlike humans, salmon 
are short-lived (four years) and suffer outbreaks of ISAV 
with high levels of mortality. Moreover, variants in the 
HPR region appeared prior to the development of vac- 
cines. Determining the degree of adaptation to the tRNA 
pool could help in confirming that the codons in the 
HPR region reduce translation efficiency. However, the 
estimation of the number of tRNA copies in Salmo salar, 
using the tRNAScan-SE program (which recognizes 
close to 100% of the tRNA in a genome), showed that 
Salmo salar has a large number of tRNA genes 
(> 65000), which is much higher than the approximately 
500 copies in humans and is on the order of the number 
found in other teleosts, like zebra fish (-45,000) [42]. 
Interestingly, close to a third of the identified tRNA is 
tRNA- Ala and another third is pseudogenes. This dis- 
proportionately high level of tRNA suggests the presence 
of a repeated SINE-type element, which presents hom- 
ologous regions to those of the tRNA [43]. Repeated 
SINE-type elements have been described in salmonoids 
and other teleosts, but with leucine and lysine tRNA 
[44-47]. This suggests the presence of a new SINE-type 
in Salmo salar, although further studies will be necessary 
once the S. salar genome has been completely annotated 
to characterize this type of element. Despite the signifi- 
cant disproportion of Ala-tRNA compared to the other 
tRNAs, the W values of the codons in the HPRO region 
show three areas, close to the area with negative CPS, 
where codons with low tRNA availability accumulate 
(Additional file 4: Table S3, Additional file 5: Table S4 
and Additional file 1: Figure S8). This reinforces the idea 
that HPRO is a zone of reduced translation efficiency. 

In summary, our analysis shows a correlation between 
the CPB value of the HE gene or the HPR region and 
the virulence of the ISA virus. To date, we have not de- 
termined if the CPB value in the ISAV is related to gene 
translation. However, in vitro data from influenza A and 
the poliovirus strongly suggest the existence of this rela- 
tionship. Although it is an over-simplification of the 
pathogenesis to explain virulence based solely on the 
adaptation of viral genes to the hosts codon pair usage, 
to our knowledge the correlations described in this work 
constitute the only natural example of de-optimization 
of codon pairs as the element correlating to the level of 
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pathogenicity of a virus and constitute a reliable explan- 
ation of the role of the HPR region in the ISA virus as a 
modulating factor of virulence. Without doubt, the de- 
velopment of a reverse genetic technique could verify 
this relationship, but until this technique becomes avail- 
able, sequence analysis is a useful tool 

Materials and methods 

Calculation of tAI 

We calculated tAI in accordance with what was 
established by dos Reis [12] (Equation 1 and 2), estimat- 
ing the relative abundance of the tRNAs in function of 
the number of copies of each tRNA in the Salmo salar 
genome. The number of copies of each tRNA was deter- 
mined with the tRNASCAN-SE program [48] in Linux, 
which was used to analyze the 555960 contigs in the 
GenBank database (Project ID: 72713) corresponding to 
the recently released salmon genome. 



W i = Y^(l-S i j)tGNC il 



(1) 



7=1 



where Wi is the relative adaptiveness of the codon i th to 
the tRNA pool, n { is the number of tRNA isoacceptors 
that recognize the i th codon, tGCNij is the number of 
copies of the j th tRNA gene that recognize the i th codon, 
and Sij is the selective constraint in the efficiency of 
codon-anticodon pairing. 



tAIn 




(2) 



The adaptation of a gene to the tRNA pool is calcu- 
lated according to equation 2, in which Wi is defined as 
the quotient between Wi and W max (Wi/W max ), ikg is 
the codon defined by the k th triplet of gene g, and l g is 
the length of gene g in codifying codons. 

Determination of bias in codon-pair usage 

Bias in codon-pair usage was determined according to 
what was described by Coleman [31]. To do this, we 
recalculated the codon-usage tables in Kazusa (http:// 
www.kazusa.or.jp/codon/), using the information in the 
codifying regions of the Salmo salar genes available in 
Genebank. The sequence of the codifying regions of 
these genes was downloaded from the RefSeq database 
and subsequently filtered to eliminate redundant genes 
or genes with ambiguous nucleotides or with incomplete 
codifying regions. A total of 3485 sequences passed 
through the filters, which represented the entrance 
of 1168431 codons. Based on this set of sequences 
(Additional file 6: Table S5), we calculated the codon 
and codon-pair frequency, using scripts in the language 
Python. The observed frequency of amino acid pairs was 



deduced based on the sum of all the codon pairs that co- 
dify for the same combination of amino acids. Subse- 
quently, we used these frequencies to calculate the CPS 
index (codon-pair score) using the equation described by 
Mueller et al. 2006 [33] (Equation 3), where f(AB) corre- 
sponds to the frequency of the appearance of the AB 
codon pair expressed in parts-per-thousands with re- 
spect to the total number of codon pairs in the genome. 
f(A) and f(B) correspond to the abundance of each of 
the individual codons expressed in parts-per-thousands 
with respect to the total number of codons in the gen- 
ome. f(X) and f(Y) corresponds to the amino acids codi- 
fied by the A and B codons respectively and f(XY) is the 
abundance of the amino acid pairs in all the proteins of 
the organism under study. The CPS values of the 3271 
codon pairs in Salmo salar can be found in the supple- 
mentary material (Additional file 7: Table S6). We calcu- 
lated the bias in codon usage as the arithmetic mean 
of the codon-pair score of a gene (CPS) described by 
Coleman et al. (2006) Equation 4, in which i is the i th 
codon pair of a gene and / is the total length of the gene 
expressed in the quantity of codified amino acids. 



CPS = Ln 



f(AB) f(X) xf(Y)\ 



(A)xf(B) f(XY) J 



CPB 



tjCPSj 
l-l 



(3) 



(4) 



Calculation of the CAI 

The codon adaptation index was calculated with the Em- 
boss program [49] based on the frequency of codon 
usage of Salmo salar. 

Prediction of free folding energy 

Free folding energy was calculated with the Mfold pro- 
gram through the web server (http://mfold.rna.albany. 
edu/?q=mfold) using the standard parameters. The en- 
ergy was calculated for the region between residuals 330 
and 371 with respect to HE SK779 HPRO. 

Virulence data 

The virulence of 11 ISA viruses with known genomes 
was obtained from data published by Mjaaland et al. in 
2005, in which they reported mortality rates 60 days 
after infection. This work analyzed the mortality rates of 
two fish populations from the same parents and the 
mortality rates of fish with genetic heterogeneity. The 
fish with the same parents were infected by cohabitation 
(TGI and TGII) with genetically heterogeneous fish that 
were previously infected by intraperitoneal injection 
(COH) [36]. 
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HE genes of ISAV 

The complete sequence of the codifying regions of the 
ISAV HE genes was downloaded from Genbank. All the 
genes that did not begin with ATG were filtered out, 
ending with a stop codon and ambiguous nucleotides. A 
total of 63 sequences passed through the filter. The 
GenBank codes of these sequences can be found in the 
supplementary material. 

Graphs and statistical methods 

The graphs and statistical tests were made using the 
program SigmaPlot 11. The significance of the differ- 
ences or correlations among the data groups obtained 
was evaluated with non-parametric tests (Rank Sum Test 
for comparing two groups or Spearman Rank Order for 
correlations) using a value of p < 0.05 as a cutoff. 
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